Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LMMS Memory Manager #1088

Merged
merged 28 commits into from
Nov 18, 2014
Merged

LMMS Memory Manager #1088

merged 28 commits into from
Nov 18, 2014

Conversation

diizy
Copy link
Contributor

@diizy diizy commented Aug 22, 2014

See post on mailing list for details.

Testing appreciated before merging.

@Sti2nd
Copy link
Contributor

Sti2nd commented Aug 25, 2014

I'll just link to the mailthread for easy finding later http://linux-multimedia-studio-lmms.996328.n3.nabble.com/LMMS-Memory-Manager-td10076.html

@diizy
Copy link
Contributor Author

diizy commented Aug 26, 2014

66 files changed and counting... this is building up to a pretty massive pull request. I might have to write a whole brief just to document all the changes to the codebase done here...

Note about the current state of the branch (in case someone is testing this): the vol/pan knobs on instruments are temporarily broken... not a bug, just unfinished business. Will be addressed before this branch gets merged.

@tresf
Copy link
Member

tresf commented Aug 27, 2014

Holler if you want a windows build. I feel bad I haven't been helping with testing lately, but I can at least fire off a build for our testers... :)

@diizy
Copy link
Contributor Author

diizy commented Aug 27, 2014

Nah, not just yet, see: broken knobs. But yeah I'll let you know, thanks.

@diizy
Copy link
Contributor Author

diizy commented Aug 27, 2014

@tresf, the vol/pan functionality is restored now.

@tresf
Copy link
Member

tresf commented Aug 27, 2014

Ok, I tried to build this, but it appears it still needs some work for the terribly fussy win32 world:

I remember with the SWH ladspa C files there were quite a few Win32 forward-declaration workarounds but that is just speculation.

Linking CXX shared module amplifier.dll
CMakeFiles/amplifier.dir/objects.a(Amplifier.obj): In function `Effect::operator new(unsigned long long)':
/home/ubuntu/lmms/include/Effect.h:42: undefined reference to `MemoryManager::alloc(unsigned long long)'
CMakeFiles/amplifier.dir/objects.a(Amplifier.obj): In function `configManager::operator new(unsigned long long)':
/home/ubuntu/lmms/include/config_mgr.h:52: undefined reference to `MemoryManager::alloc(unsigned long long)'
/home/ubuntu/lmms/include/config_mgr.h:52: undefined reference to `MemoryManager::alloc(unsigned long long)'
/home/ubuntu/lmms/include/config_mgr.h:52: undefined reference to `MemoryManager::alloc(unsigned long long)'
CMakeFiles/amplifier.dir/objects.a(Amplifier.obj): In function `Effect::operator delete(void*)':
/home/ubuntu/lmms/include/Effect.h:42: undefined reference to `MemoryManager::free(void*)'
CMakeFiles/amplifier.dir/objects.a(Amplifier.obj): In function `AutomatableModel::operator delete(void*)':
/home/ubuntu/lmms/include/AutomatableModel.h:69: undefined reference to `MemoryManager::free(void*)'
collect2: error: ld returned 1 exit status
make[2]: *** [plugins/Amplifier/amplifier.dll] Error 1
make[1]: *** [plugins/Amplifier/CMakeFiles/amplifier.dir/all] Error 2
make: *** [all] Error 2

@diizy
Copy link
Contributor Author

diizy commented Aug 27, 2014

On 08/27/2014 11:31 PM, Tres Finocchiaro wrote:

Ok, I tried to build this, but it appears it still needs some work for
the terribly fussy win32 world:

Well, someone needs to step up to deal with this, or the next version of
LMMS will be Linux/Mac only...

@tresf
Copy link
Member

tresf commented Aug 27, 2014

I spent some time on it to no avail. I'm pretty sure the offending line is here (MM_OPERATORS):
https://github.com/diizy/lmms/blob/3bb2c50f77b0bed909206226fbfe6aee1531c8d3/include/AudioPort.h#L39

@diizy
Copy link
Contributor Author

diizy commented Aug 27, 2014

That's just a macro for overriding new/delete operators. It's necessary to make the class use the MemoryManager instead of doing direct heap allocations. It's used in many other classes as well.

@tresf
Copy link
Member

tresf commented Aug 27, 2014

It's used in many other classes as well.

Yes, I see that. This issue only occurs during the linking process, which isn't uncommon from what I've seen with MinGW.

Without much knowledge of the macro realm, could you explain a bit to me so that I can research some of the techniques used against the internet of things?

@diizy
Copy link
Contributor Author

diizy commented Aug 28, 2014

On 08/28/2014 02:15 AM, Tres Finocchiaro wrote:

It's used in many other classes as well.

Yes, I see that. This issue only occurs during the linking process,
which isn't uncommon from what I've seen with MinGW.

Without much knowledge of the macro realm, could you explain a bit to
me so that I can research some of the techniques used against the
internet if things?

A macro is a command for the C PreProcessor (CPP, not to be confused
with CPlusPlus).

What it does is basically just string replacement. When you #define a
macro, that means that anywhere in the code that macro is found, it is
replaced with the text in the #define, it won't be checked for any
syntax or program logic, it's just straight-up text replacement. Macros
are really more of a C thing and often kind of frowned upon by C++ purists.

Here, the macro simply adds operator overrides to the classes where it
is invoked. It overrides the new and delete operators. Usually these
operators call malloc(), which allocates memory on the heap, and this is
the thing we want to avoid, so the macro overrides them to call
MemoryManager::alloc() instead, which doesn't allocate memory on the
heap, but instead fetches memory from the pre-allocated memory pools.

@tresf
Copy link
Member

tresf commented Aug 28, 2014

Ok thanks. Some of the mingw posts I read mentioned the preprocessor and how it can cause linking issues. I have to read more, but that puts me in the right direction.

@tresf
Copy link
Member

tresf commented Aug 28, 2014

So the preprocessor stuff I'm stumbling on warns of something refereed to as "name mangling". MinGW has a page dedicated to this. Unfortunately the majority of it is above my head. Here's the article...

http://www.mingw.org/wiki/Interoperability_of_Libraries_Created_by_Different_Compiler_Brands

Edit: ... and another article... http://www.transmissionzero.co.uk/computing/building-dlls-with-mingw/

Seems we use this approach in export.h

m_instrument( instrument )
{
}
InstrumentPlayHandle( Instrument * instrument, InstrumentTrack* instrumentTrack );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tobydox, should this be class EXPORT InstrumentPlayHandle ? It's the only way I could get compilation for Win32/64 to succeed but I'm just taking guesses here.

@tresf
Copy link
Member

tresf commented Aug 28, 2014

For anyone willing to test (@musikBear, @Sti2nd, @StakeoutPunch) here are windows builds of master with new MemoryManager.

https://github.com/tresf/lmms/releases/tag/v1.0.990

@Sti2nd
Copy link
Contributor

Sti2nd commented Aug 28, 2014

Played around in 64 bit, no bugs. A quick question, is the CPU meter fixed? My processor is at 22% at maxiumum,
image
and LMMS shows the CPU
image
almost full on unfa-Spoken. So it is clearly wrong...?

@tresf
Copy link
Member

tresf commented Aug 28, 2014

and LMMS shows the CPU almost full on unfa-Spoken. So it is clearly wrong...?

Looks fine over here. :)

image

@diizy
Copy link
Contributor Author

diizy commented Aug 28, 2014

On 08/28/2014 03:37 PM, Stian Jørgensrud wrote:

Played around in 64 bit, no bugs. A quick question, is the CPU meter
fixed? My processor is at 22% at maxiumum,
image
https://cloud.githubusercontent.com/assets/6373430/4075463/08765b02-2eb0-11e4-80cf-c91f83731626.png
and LMMS shows the CPU
image
https://cloud.githubusercontent.com/assets/6373430/4075453/f2dd8fb8-2eaf-11e4-920b-b1b9b113d96d.png
almost full on unfa-Spoken. So it is clearly wrong...?

Nope, it's not wrong. The LMMS CPU meter does not necessarily correlate
with CPU usage stats. LMMS shows the amount of CPU used from maximum
amount of CPU resources available to LMMS, based on how much time it
takes to render the period. For example, if your period size is set to
5ms, and rendering the period takes 2.5ms, then that translates to 50%
CPU usage.

@Sti2nd
Copy link
Contributor

Sti2nd commented Aug 28, 2014

Oooh

@exTunes
Copy link

exTunes commented Aug 28, 2014

I posted a bug here: #1108

@diizy
Copy link
Contributor Author

diizy commented Aug 29, 2014

Ok, did a sync and rebase, to get Toby's GUI-less rendering fix in.

So now it should be possible to benchmark this branch against the current master. I'd be very interested to know if there are any quantifiable performance benefits.

@tresf
Copy link
Member

tresf commented Aug 30, 2014

So this isn't exactly a benchmark, but I compared memory usage on Win7 x64:

Memory Usage - unfa-Spoken.mmpz

  • 1.0.3 - 573MB
  • 1.0.991 - 697MB

There're many reasons why this shouldn't be considered even a reasonably accurate benchmark: First - It compares an EOL branch to a Bleeding edge branch so there are way too many changes to make any argument that the MemoryManager changes have contributed to this. Second - Memory efficiencies shouldn't be measured by the memory overhead, but rather the engine efficiency as a whole (including spikes, io tomes, etc). However, despite all of this I still wanted to share some early findings. 🐰

@exTunes
Copy link

exTunes commented Aug 30, 2014

With the Memory Manager Branch, I can master a rendered track in a new file.
Usually LMMS crashes (with more than 7 GB RAM consumed) after playing back a sample track for one minute.. even without any effects!

@diizy
Copy link
Contributor Author

diizy commented Aug 31, 2014

On 08/30/2014 04:29 PM, Tres Finocchiaro wrote:

So this isn't exactly a benchmark, but I compared memory usage on Win7
x64:

Memory Usage - unfa-Spoken.mmpz

  • |1.0.3| - 573MB
  • |1.0.991| - 697MB

There's many reasons why this shouldn't be considered even a
reasonably accurate benchmark: First - It compares an EOL branch to a
Bleeding edge branch so there are way too many changes to make any
argument that the MemoryManager changes have contributed to this.
Second - Memory efficiencies shouldn't be measured by the memory
overhead, but rather the engine efficiency as a whole (including
spikes, io tomes, etc). However, despite all of this I swill wanted to
share.

Oh no, it's probably fairly accurate.

Memory usage /is /going to increase quite a bit with the MemoryManager -
this is because it allocates a huge chunk of memory at startup, which is
then used throughout the program... Also, the memory is then used as
"chunks" of 64 bytes, so every object's memory footprint gets rounded up
to the nearest 64 bytes, meaning that if you use the MM to allocate
memory for an object that is 65 bytes in size, it will allocate 128
bytes for it.

All of this causes some waste in memory, but here's the thing... I don't
really care about that. RAM is cheap these days, tradeoff of higher
memory usage -> better performance is one that I'm more than willing to
make.

@eagles051387
Copy link

What you mentioned about ram being cheap. I wonder if someone has an SSD a
chunk of memory so to speak can be allocated on the SSD

On Sun, Aug 31, 2014 at 8:11 AM, Vesa V notifications@github.com wrote:

On 08/30/2014 04:29 PM, Tres Finocchiaro wrote:

So this isn't exactly a benchmark, but I compared memory usage on Win7
x64:

Memory Usage - unfa-Spoken.mmpz

  • |1.0.3| - 573MB
  • |1.0.991| - 697MB

There's many reasons why this shouldn't be considered even a
reasonably accurate benchmark: First - It compares an EOL branch to a
Bleeding edge branch so there are way too many changes to make any
argument that the MemoryManager changes have contributed to this.
Second - Memory efficiencies shouldn't be measured by the memory
overhead, but rather the engine efficiency as a whole (including
spikes, io tomes, etc). However, despite all of this I swill wanted to
share.

Oh no, it's probably fairly accurate.

Memory usage /is /going to increase quite a bit with the MemoryManager -
this is because it allocates a huge chunk of memory at startup, which is
then used throughout the program... Also, the memory is then used as
"chunks" of 64 bytes, so every object's memory footprint gets rounded up
to the nearest 64 bytes, meaning that if you use the MM to allocate
memory for an object that is 65 bytes in size, it will allocate 128
bytes for it.

All of this causes some waste in memory, but here's the thing... I don't
really care about that. RAM is cheap these days, tradeoff of higher
memory usage -> better performance is one that I'm more than willing to
make.


Reply to this email directly or view it on GitHub
#1088 (comment).

Jonathan Aquilina

@diizy
Copy link
Contributor Author

diizy commented Aug 31, 2014

On 08/31/2014 09:24 AM, eagles051387 wrote:

What you mentioned about ram being cheap. I wonder if someone has an
SSD a
chunk of memory so to speak can be allocated on the SSD

That's for the OS to handle with swap partitions and such. We're not
going to do any swapping in the LMMS codebase, I can't see it being
beneficial in any way... we need quick memory access, and swapping seems
to me like something that would be horrible for RT-safety.

diizy and others added 11 commits November 18, 2014 13:58
Avoid crashes caused by worker threads accessing the buffer manager
before it is initialized. Therefore initialize it from within the
Mixer constructor which has the side effect that it gets initialized
in console-only rendering mode as well.
Samplebuffer: reload all samples when samplerate changes. This is because of the way LMMS uses samples: we always resample all samples t$
LadspaEffect: some safeguards for the non-inplacebroken plugins which use the same buffer for input and output. Theoretically, if some p$
FxMixer: fix effect processing in multichannel-chains
Add treshold knob to peak controller
This causes the peak controller to react only when the measured peaks are above the set treshold
Might be useful for finetuning your sidechains
@diizy
Copy link
Contributor Author

diizy commented Nov 18, 2014

So, I think I'm just going to merge this branch now. There's so many commits here, getting troublesome to keep this in sync with master, etc... so any performance issues can be profiled and fixed later - if we bother to work much more on 1.2 anymore, since it seems we could just skip right into 2.0...

@jasp00
Copy link
Member

jasp00 commented Aug 20, 2016

This memory manager should be dropped. I do not think it is better than the allocator in glibc. MemoryManager uses locks, while glibc's implementation is "without lock contention", according to the source. MemoryManager uses a slow algorithm to find free chunks.

The use of a custom manager should be centered on objects created by the mixer, and leave out other objects such as SampleBuffer. Either a more developed general purpose allocator should be used, TLSF could be an option, or a manager specific to LMMS' needs should be designed.

@midi-pascal
Copy link
Contributor

👍 for TLSF that I use in the top 1 commercial video games sound engine.

@jasp00
Copy link
Member

jasp00 commented Aug 21, 2016

I have been looking at TLSF and glibc malloc. They both use locks for multiple threads. Without locks, an inefficient allocation method is needed. One conclusion is that LocklessList is not lockless, because new and delete are on different threads.

Allocation methods that do not consider a maximum usage incur a memory pool extension, which affects real-time processing. General purpose allocators face fragmentation, which makes the pool extension more likely.

So the way to be lockless is to use structures with a maximum number of same-sized elements, something like m_items in MixerWorkerThread. Using bit arrays to detect free elements may help a little with efficiency.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants